claim 1
- North America > United States (0.04)
- Europe > Czechia > Prague (0.04)
6 APPENDIX
This appendix provides the mathematical proofs of the theoretical results and additional experimental results of our paper "Neuron Merging: Compensating for Pruned Neurons," accepted at 34th ... The overall derivation is the same as Section 3.1. If there exists a column with more than one strictly positive entry, then Eq. 13a does not ... For simple notation, subscript i is omitted. In Table 4, we present the test results of VGG-16 and ResNet-34 on ImageNet. ... ResNet-34, we prune all convolution layers in equal proportion. For example, '1-1' indicates the first pruned layer in the first residual block.
Inductive Bias and Spectral Properties of Single-Head Attention in High Dimensions
Boncoraglio, Fabrizio, Erba, Vittorio, Troiani, Emanuele, Krzakala, Florent, Zdeborová, Lenka
We study empirical risk minimization in a single-head tied-attention layer trained on synthetic high-dimensional sequence tasks, given by the recently introduced attention-indexed model. Using tools from random matrix theory, spin-glass physics, and approximate message passing, we derive sharp asymptotics for training and test errors, locate interpolation and recovery thresholds, and characterize the limiting spectral distribution of the learned weights. Weight decay induces an implicit nuclear-norm regularization, favoring low-rank query and key matrices. Leveraging this, we compare the standard factorized training of query and key matrices with a direct parameterization in which their product is trained element-wise, revealing the inductive bias introduced by the factorized form. Remarkably, the predicted spectral distribution echoes empirical trends reported in large-scale transformers, offering a theoretical perspective consistent with these phenomena.
- North America > United States (0.14)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
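The link between weight decay and nuclear-norm regularization mentioned in the abstract above can be illustrated with a standard matrix identity. This is a general fact about factorized parameterizations, not the paper's own derivation; the symbols $Q$, $K$, $W$, and the inner dimension $r$ are notation assumed here for illustration.

```latex
% Assumed notation: W = Q K^\top with Q, K \in \mathbb{R}^{d \times r}, r \ge \operatorname{rank}(W).
\[
  \|W\|_{*} \;=\; \min_{Q K^{\top} = W} \; \tfrac{1}{2}\left( \|Q\|_F^{2} + \|K\|_F^{2} \right).
\]
% Tied case (Q = K): W = Q Q^\top is positive semidefinite, so
\[
  \|Q\|_F^{2} \;=\; \operatorname{tr}\!\left(Q Q^{\top}\right) \;=\; \operatorname{tr}(W) \;=\; \|W\|_{*}.
\]
```

Under this identity, an $\ell_2$ penalty on the factors acts as a nuclear-norm penalty on their product, which is one way to read the low-rank bias described in the abstract.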
PatentScore: Multi-dimensional Evaluation of LLM-Generated Patent Claims
Yoo, Yongmin, Xu, Qiongkai, Cao, Longbing
High-stakes texts such as patent claims, medical records, and technical reports are structurally complex and demand a high degree of reliability and precision. While large language models (LLMs) have recently been applied to automate their generation in high-stakes domains, reliably evaluating such outputs remains a major challenge. Conventional natural language generation (NLG) metrics are effective for generic documents but fail to capture the structural and legal characteristics essential to evaluating complex high-stakes documents. To address this gap, we propose PatentScore, a multi-dimensional evaluation framework specifically designed for one of the most intricate and rigorous domains: patent claims. PatentScore integrates hierarchical decomposition of claim elements, validation patterns grounded in legal and technical standards, and scoring across structural, semantic, and legal dimensions. In experiments on our dataset, which consists of 400 Claim 1 examples, PatentScore achieved the highest correlation with expert annotations ($r = 0.819$), significantly outperforming widely used NLG metrics. This work establishes a new standard for evaluating LLM-generated patent claims, providing a solid foundation for research on patent generation and validation.
- North America > United States (0.28)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Last-Iterate Convergence of No-Regret Learning for Equilibria in Bargaining Games
Kamp, Serafina, Liebman, Reese, Fish, Benjamin
Bargaining games, where agents attempt to agree on how to split utility, are an important class of games used to study economic behavior, which motivates a study of online learning algorithms in these games. In this work, we examine when no-regret learning algorithms converge to Nash equilibria in bargaining games. Recent results have shown that online algorithms related to Follow the Regularized Leader (FTRL) converge to Nash equilibria (NE) in the last iterate in a wide variety of games, including zero-sum games. However, bargaining games do not have the properties used previously to establish convergence guarantees, even in the simplest case of the ultimatum game, which features a single take-it-or-leave-it offer. Nonetheless, we establish that FTRL (without the modifications necessary for zero-sum games) achieves last-iterate convergence to an approximate NE in the ultimatum game, along with a bound on convergence time under mild assumptions. Further, we provide experimental results to demonstrate that convergence to NE, including NE with asymmetric payoffs, occurs under a broad range of initial conditions, both in the ultimatum game and in bargaining games with multiple rounds. This work demonstrates how complex economic behavior (e.g. learning to use threats and the existence of many possible equilibrium outcomes) can result from using a simple learning algorithm, and that FTRL can converge to equilibria in a more diverse set of games than previously known.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
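The FTRL dynamics described in the abstract above can be sketched on a discretized ultimatum game. This is a minimal illustration under stated assumptions, not the authors' setup: the offer grid of size `k + 1`, the entropic regularizer (which makes FTRL equivalent to multiplicative weights), full-information feedback, and the values of `eta` and `steps` are all choices made here, and the function names are hypothetical.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def ultimatum_payoffs(k):
    """Payoff matrices for a discretized ultimatum game (assumed grid).

    Row i: proposer offers the share i/k to the responder.
    Column j: responder accepts any offer of at least j/k.
    """
    offers = np.arange(k + 1) / k
    thresholds = np.arange(k + 1) / k
    accepted = offers[:, None] >= thresholds[None, :]
    A = (1.0 - offers)[:, None] * accepted  # proposer keeps 1 - offer
    B = offers[:, None] * accepted          # responder receives the offer
    return A, B


def ftrl_entropic(A, B, eta=0.1, steps=2000):
    """FTRL with an entropic regularizer for both players.

    Each player plays softmax(eta * cumulative expected payoff vector).
    Returns the last-iterate mixed strategies (x, y).
    """
    n, m = A.shape
    cum_x = np.zeros(n)
    cum_y = np.zeros(m)
    for _ in range(steps):
        x = softmax(eta * cum_x)
        y = softmax(eta * cum_y)
        cum_x += A @ y    # expected payoff of each offer against y
        cum_y += B.T @ x  # expected payoff of each threshold against x
    return softmax(eta * cum_x), softmax(eta * cum_y)


if __name__ == "__main__":
    A, B = ultimatum_payoffs(10)
    x, y = ftrl_entropic(A, B)
    print("most likely offer:", np.argmax(x) / 10)
    print("most likely threshold:", np.argmax(y) / 10)
```

Which equilibrium the last iterate approaches depends on the grid, the step size, and the initial conditions, so the sketch makes no claim about the outcome of any particular run.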
The final solution of the Hitchhiker's problem #5
Omladič, Matjaž, Vuk, Martin, Zalar, Aljaž
The recent survey [2], nicknamed the "Hitchhiker's Guide", has raised the rating of quasi-copula problems in the dependence modeling community in spite of the lack of statistical interpretation of quasi-copulas. In our previous work we addressed the question of extreme values of the mass distribution associated with multidimensional quasi-copulas. Using a linear programming approach we were able to settle [2, Open Problem 5] up to d = 17 and disprove a recent conjecture from [25] on the solution to that problem. In this note we use an analytical approach to provide a complete answer to the original question.
- Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
- Europe > Netherlands (0.04)
Enriching Patent Claim Generation with European Patent Dataset
Jiang, Lekang, Li, Chengzu, Goetz, Stephan
Drafting patent claims is time-intensive, costly, and requires professional skill. Therefore, researchers have investigated large language models (LLMs) to assist inventors in writing claims. However, existing work has largely relied on datasets from the United States Patent and Trademark Office (USPTO). To enlarge research scope regarding various jurisdictions, drafting conventions, and legal standards, we introduce EPD, a European patent dataset. EPD presents rich textual data and structured metadata to support multiple patent-related tasks, including claim generation. This dataset enriches the field in three critical aspects: (1) Jurisdictional diversity: Patents from different offices vary in legal and drafting conventions. EPD fills a critical gap by providing a benchmark for European patents to enable more comprehensive evaluation. (2) Quality improvement: EPD offers high-quality granted patents with finalized and legally approved texts, whereas others consist of patent applications that are unexamined or provisional. Experiments show that LLMs fine-tuned on EPD significantly outperform those trained on previous datasets and even GPT-4o in claim quality and cross-domain generalization. (3) Real-world simulation: We propose a difficult subset of EPD to better reflect real-world challenges of claim generation. Results reveal that all tested LLMs perform substantially worse on these challenging samples, which highlights the need for future research.
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Italy > Tuscany > Florence (0.04)